maxout network
Appendix
The appendix is organized as follows. Appendix A contains proofs related to activation patterns and activation regions; Appendix B, proofs related to the numbers of regions attained with positive probability; Appendix D, proofs related to the expected volume of activation regions; and Appendix E, proofs related to the expected number of activation regions.
Sparse Hybrid Linear-Morphological Networks
Konstantinos Fotopoulos, Christos Garoufis, Petros Maragos
We investigate hybrid linear-morphological networks. Recent studies highlight the inherent affinity of morphological layers to pruning, but also their difficulty in training. We propose a hybrid network structure in which morphological layers are inserted between the linear layers of the network, in place of activation functions. We experiment with the following morphological layers: (1) maxout pooling layers (as a special case of a morphological layer), (2) fully connected dense morphological layers, and (3) a novel, sparsely initialized variant of (2). We conduct experiments on the MagnaTagATune (MTAT; music auto-tagging) and CIFAR-10 (image classification) datasets, replacing the linear classification heads of state-of-the-art convolutional network architectures with our proposed network structure for the various morphological layers. We demonstrate that these networks induce sparsity in their linear layers, making them more prunable under L1 unstructured pruning. We also show that on MTAT our proposed sparsely initialized layer achieves slightly better performance than ReLU, maxout, and densely initialized max-plus layers, and exhibits faster initial convergence.
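To make the layer types concrete, here is a minimal sketch of a dense max-plus (morphological dilation) layer and a sparse initialization in the spirit the abstract describes. This is not the authors' implementation; the function names and the choice of a single active weight per output unit (with the remaining weights pushed to a large negative constant) are illustrative assumptions.

```python
import random

def maxplus_forward(x, W):
    # Dense max-plus layer: y_j = max_i (W[j][i] + x[i]).
    # This is the tropical (max-plus) analogue of a linear layer,
    # replacing sums with max and products with addition.
    return [max(w_i + x_i for w_i, x_i in zip(row, x)) for row in W]

def sparse_maxplus_init(out_dim, in_dim, neg=-50.0):
    # Illustrative sparse initialization (assumption): each output unit
    # starts with exactly one weight at 0.0; all other weights are set
    # to a large negative value so they rarely win the max early on.
    W = [[neg] * in_dim for _ in range(out_dim)]
    for j in range(out_dim):
        W[j][random.randrange(in_dim)] = 0.0
    return W
```

With such an initialization, each output unit initially routes a single input forward, and training can gradually promote other weights into the max; maxout pooling is the special case where each row has 0.0 on a fixed disjoint group of inputs.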
On the Number of Linear Regions of Deep Neural Networks
Guido F. Montufar, Razvan Pascanu, Kyunghyun Cho, Yoshua Bengio
We study the complexity of functions computable by deep feedforward neural networks with piecewise linear activations in terms of the symmetries and the number of linear regions that they have. Deep networks are able to sequentially map portions of each layer's input-space to the same output. In this way, deep models compute functions that react equally to complicated patterns of different inputs. The compositional structure of these functions enables them to re-use pieces of computation exponentially often in terms of the network's depth. This paper investigates the complexity of such compositional maps and contributes new theoretical results regarding the advantage of depth for neural networks with piecewise linear activation functions. In particular, our analysis is not specific to a single family of models, and as an example, we employ it for rectifier and maxout networks. We improve complexity bounds from pre-existing work and investigate the behavior of units in higher layers.
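The central object in this analysis, a linear region, can be probed empirically: each input point induces an activation pattern (which units are on), and distinct patterns witness distinct linear regions. The sketch below, for a one-hidden-layer ReLU network, is an illustrative lower-bound estimator and not a method from the paper; the function name and sampling scheme are assumptions.

```python
def relu_patterns(weights, biases, points):
    # Count distinct activation patterns of a one-hidden-layer ReLU net
    # over a set of input points. Each hidden unit contributes one sign
    # bit (pre-activation > 0 or not); each distinct bit pattern found
    # corresponds to a linear region the sample has hit, so the count is
    # a lower bound on the network's total number of linear regions.
    patterns = set()
    for x in points:
        bits = tuple(
            1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) + b > 0 else 0
            for w, b in zip(weights, biases)
        )
        patterns.add(bits)
    return len(patterns)
```

For example, two hidden units on a 1-D input define two breakpoints, so a fine grid recovers at most three regions; the exponential gains discussed in the paper come from composing such layers in depth, where later layers fold many input regions onto the same output.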